Rationalization of data used in model of time varying event behavior

ABSTRACT

A method for rationalization of data used to model a time-variant behavior provides advantages in that storage requirements for such data are reduced and accuracy of detection of events in the behavior is increased. The method uses labels added to training data to indicate whether that data relates to recent events or not. A classifier is generated from the labelled training data. By removing old data which the classifier would classify differently were the old data re-labelled as new, a selective purging of the old training data takes place each time new training data becomes available. The method is especially useful in detecting fraudulent use of, or faults in, a communications network.

This application is the U.S. national phase of international application PCT/GB00/04597 filed 1 Dec. 2000 which designated the U.S.

BACKGROUND

1. Field of the Invention

The present invention relates to a method of rationalising data stored in a physical form.

It has particular utility in relation to rationalising data that reflects a time-variant behaviour.

2. Related Art

In many cases, in order to model a time-variant behaviour it is necessary to gather training data over the course of time. Normally, the training data comprises a plurality of examples, each of which provides values of a plurality of parameters, which values characterise that example. The examples reflect the time-variant behaviour that is to be modelled.

Usually, a time-variant behaviour is modelled by running a computer program which controls the computer to output predicted values of one of the parameters given an incomplete example that only provides values for the other parameters. Where the parameter whose value is being sought takes a few discrete values (or falls within one of a few discrete value ranges) then the model can be said to provide a classification of the incomplete example.

Conventionally, one of two approaches is used in gathering data over the course of time in order to model a time-variant behaviour.

Firstly, data can simply be accumulated over time. The disadvantage of this approach is that after any change in the time-variant behaviour the training data includes examples that reflect aspects of the time-variant behaviour that no longer subsist. The resulting increase in the proportion of the training data which is no longer applicable leads to the training data reflecting the time-variant behaviour less accurately. This results in any models that are produced on the basis of the training data also becoming less accurate.

Secondly, existing training data can be frequently replaced by training data relating to more recent events. However, if the behaviour that is being modelled includes rare events that are of interest (as is the case in relation to fraudulent calls, or failed calls in telephone networks, for example) then the paucity of data relating to such events results in the model being unsatisfactorily inaccurate.

The skilled person has therefore, up until the advent of the present invention, been faced with a trade-off. On the one hand, if he or she accumulates training data over time then models based on the training data lack adaptability. On the other hand, if he or she frequently replaces the training data then the accuracy of the model is limited.

BRIEF SUMMARY

According to a first aspect of the present invention, there is provided a method of rationalising training data stored in a physical form,

wherein said training data comprises an extant set of training examples, each member of said extant set providing values for a newness parameter and one or more other parameters, the value of said newness parameter indicating that the event to which that training example relates occurred before a predetermined time;

said method comprising the steps of:

gathering an update set of training examples, each member of said update set providing values for said newness parameter and said one or more other parameters, the value of said newness parameter indicating that the event to which that training example relates occurred after said predetermined time;

analysing said extant set and said update set to generate a classifier which is able to classify training examples from both said extant set and said update set, the classification of the training examples being dependent on the value of said newness parameter; and

on the basis of the generated classifier, selecting a surviving set of training examples from said extant set.

By generating a classifier which is able to classify training examples on the basis of whether they occurred before or after a predetermined time, and then using that classifier to remove only selected extant training examples, extant training examples which relate to aspects of the time-variant behaviour that ceased before the predetermined time are removed whereas those that relate to behaviours that subsisted after the predetermined time remain.

In comparison to known methods of accumulating training data over time, the removal of training data that is no longer relevant after a change in the time-variant behaviour means that less storage capacity is required for the training data.

In comparison to the frequent replacement of training data, the maintenance of training data that is still relevant despite a change in the time-variant behaviour means that the detection or prediction of events in the time-variant behaviour is improved.

Preferably, said method further comprises the steps of:

generating said extant set by gathering training examples by adding a newness parameter value to each of a first set of incomplete extant examples, each of which provides values for said one or more other parameters, said newness parameter value indicating that said examples relate to events that occurred before said predetermined time;

generating said update set by adding a newness parameter value to each of a second set of incomplete update examples, each of which provides values for said one or more other parameters, said newness parameter value indicating that said examples relate to events that occurred after said first predetermined time.

This enables the method to be used on data that does not include time parameters.

In preferred embodiments, the method further comprises the steps of:

combining said surviving set and said update set to provide a purged set of one or more training examples; and

removing the newness parameter value from each member of said purged set to generate an incomplete purged set of one or more incomplete training examples.

This has the advantage that the storage capacity required for the data is reduced still further.

The method of the invention can be repeated each time that training data reflecting recent changes in the time-variant behaviour becomes available. Accordingly, in some embodiments of the present invention, the method further comprises the steps of:

-   adding a newness parameter value to each of said incomplete training examples of said purged set, which newness parameter value indicates that said examples relate to events that occurred before a second predetermined time, thereby forming a new set of training examples;
-   repeating steps according to the first aspect of the present invention, treating said new set of training examples as said extant set, and said second predetermined time as said predetermined time.
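The repeated application of the method can be sketched as a loop over successive update sets. This is only an illustrative outline: the function names `rationalise` and `keep_seen_destinations`, the `newness` and `dest` field names, and the data are all hypothetical. In particular, `keep_seen_destinations` is a deliberately trivial stand-in for the classifier-based selection of the surviving set, used here only so that the loop can be run.

```python
def rationalise(extant, update, select_survivors):
    """One round: label both sets, select survivors, combine, strip the labels."""
    labelled_extant = [dict(e, newness="extant") for e in extant]
    labelled_update = [dict(e, newness="new") for e in update]
    survivors = select_survivors(labelled_extant, labelled_update)
    purged = survivors + labelled_update
    # Removing the newness value yields the incomplete purged set.
    return [{k: v for k, v in e.items() if k != "newness"} for e in purged]


def keep_seen_destinations(labelled_extant, labelled_update):
    """Toy stand-in (an assumption, not the method of the specification):
    keep extant examples whose destination still occurs in the update set."""
    seen = {e["dest"] for e in labelled_update}
    return [e for e in labelled_extant if e["dest"] in seen]


# Each batch plays the role of a new update set; the purged output of one
# round is treated as the extant set of the next.
data = [{"dest": 55}, {"dest": 72}]
batches = [[{"dest": 72}], [{"dest": 72}, {"dest": 33}]]
for batch in batches:
    data = rationalise(data, batch, keep_seen_destinations)
```

Because the labels are stripped at the end of each round, the stored data never carries more than the original parameters between rounds.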

Many different types of classifiers may be used in the present invention. In preferred embodiments, said classifier generation step comprises:

analysing said training data to generate representations of logical rules, each of which rules comprises one or more criteria relating to respective ones of said one or more other parameters and a corresponding conclusion, one or more of said rules including a newness criterion which is met for those examples in which the value of said newness parameter indicates that said event occurred after said first predetermined time; and

said surviving set forming step comprises:

identifying a subset of said logical rules that include a requirement that said newness criterion is not met as outdated rules; and

removing at least some of those training examples which meet all the criteria of one or more of said outdated rules.

The use of a rule-based classifier provides a straightforward method for selecting data which would be classified differently were the value of their newness parameter to be altered to indicate that they relate to an event which occurred after said predetermined time. That results in a reduction in the processing power required to implement the invention.

In some embodiments, said removal step comprises removing all of those training examples which meet all the criteria of one or more of said outdated rules.

The extant training examples that would be classified differently were the value of their newness parameter to be altered to indicate that they relate to an event which occurred after said predetermined time will form a subset of the extant training examples which the classifier classifies using outdated rules. By removing all such extant training examples, it follows that the subset is removed, together with some other training examples. Although this method is less selective than those of the first set of preferred embodiments mentioned below, it is less complex and hence requires less processing power to implement it.

In a first set of preferred embodiments, said surviving set forming step further comprises:

-   identifying rules having a requirement that said newness criterion is met as new rules;
-   generating rationalised new rules by removing said newness criterion;
-   identifying rules having no newness criterion as surviving rules;
-   forming an up-to-date set of rules by combining said surviving rules and said rationalised new rules;
-   classifying said training example into a first class on the basis of said outdated rules;
-   classifying said training example into a second class on the basis of said up-to-date rules; and
-   removing said training example from said extant set if said first and second classes differ.

In this first set of preferred embodiments, instead of removing all those training examples that conform with outdated rules, only a selection of those training examples are removed. This is found to be better able to remove only those training examples that reflect aspects of the time-variant behaviour that have ceased.

In preferred embodiments of the present invention, said logical rules are arranged as a decision-tree. Preferably, the tree-building algorithm uses a criteria selection heuristic which chooses at each stage the locally most informative attribute. This might be based on the entropy or any number of different information measures. As will be understood by those skilled in the art, placement of relevant (i.e. informative) criteria high in the tree facilitates the removal of those criteria that are least informative by so-called ‘post pruning’ of the tree.
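One common entropy-based measure is information gain. The following is a minimal sketch of such a criteria selection heuristic, assuming information gain is the measure chosen; the function names, field names and toy examples are hypothetical and are not taken from any particular tree-building program.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(examples, attribute, target):
    """Reduction in entropy of `target` obtained by splitting on `attribute`."""
    base = entropy([e[target] for e in examples])
    groups = {}
    for e in examples:
        groups.setdefault(e[attribute], []).append(e[target])
    remainder = sum(len(g) / len(examples) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy data in which 'status' plays the role of the newness label.
examples = [
    {"dest": 55, "status": "extant", "flag": "fraud"},
    {"dest": 55, "status": "extant", "flag": "fraud"},
    {"dest": 55, "status": "new", "flag": "legit"},
    {"dest": 72, "status": "new", "flag": "fraud"},
]

# The locally most informative attribute is the one with the highest gain.
best = max(["dest", "status"], key=lambda a: information_gain(examples, a, "flag"))
```

In this toy data the newness label separates the classes better than the destination does, so it would be placed higher in the tree.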

This has the advantage that the position of the newness criterion in the decision tree allows the method to distinguish easily between variations in the data that merely represent random variations in the time-variant behaviour and those variations that represent a significant change in the time-variant behaviour.

The method of the present invention can be incorporated into a method of classifying an unclassified example. Any classifier may be generated using the rationalised training data. However, in a particularly advantageous embodiment of this aspect of the present invention, said method of classifying an unclassified example comprises generating a purged set of training examples in accordance with one of the above-mentioned first set of preferred embodiments of the present invention and classifying said unclassified example using said up-to-date rules.

In this way the amount of processing power used in generating the classifier is reduced.

Other aspects of the present invention are defined in the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

There now follows, by way of example only, a description of specific embodiments of the present invention. This description refers to the accompanying drawings, in which:

FIG. 1 is a schematic illustration of components of a personal computer which may be operated to provide a call entry classifier in accordance with first and second embodiments of the present invention;

FIG. 2 is a flow chart showing the steps undertaken in classifying call entries in a call data file as suspicious or unsuspicious in accordance with first and second embodiments of the present invention;

FIG. 3 shows a few call entries from a file containing extant training data;

FIG. 4 shows a few call entries from a file containing new training data;

FIG. 5 shows a few call entries from a file containing labelled training data, each call entry being labelled with its training status;

FIG. 6 is a flow chart showing the purging step of the method of FIG. 2 carried out in accordance with a first embodiment of the present invention; and

FIG. 7 is a flow chart showing the purging step of the method of FIG. 2 carried out in accordance with a second embodiment of the present invention.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

FIG. 1 shows a personal computer which comprises well-known hardware components connected together in a conventional manner. The well-known hardware components comprise a central processing unit 10, random access memory 12, read-only memory 14, a hard-disc 16 and input/output devices 18, 20, 22, 24 and 26. The hardware components are interconnected via one or more data and address buses 28. The input/output devices comprise a monitor 18, a keyboard 20, a mouse 22, a CD ROM drive 24 and a network card 26. The network card is connected to a server computer 30 by the public Internet 32.

In accordance with a first embodiment of the present invention, a user uses the computer to analyse data concerning recent calls made by customers of a telecommunication network operator and thereby to identify a subset of those calls which are more likely to be fraudulent than the remainder. By identifying those calls that are more likely to be fraudulent, the network operator can investigate those calls and more effectively prevent future occurrences of similar frauds.

The user begins by loading a program from a compact disc CD1 into the computer's RAM 12 and running that program.

The steps of the method for identifying calls that are likely to be fraudulent which are carried out by the computer under control of that program are set out in FIG. 2. The program first prompts the user to load an extant training data file 402 from a compact disc CD2 into the computer's RAM 12. The extant training data file 402 contains a set of call entries relating to calls that are already known to be fraudulent or legitimate, all of which relate to calls made before, for example, February 1999. As is illustrated in FIG. 3 (which shows a few examples of call entries in the extant training data file 402), each call entry in the data file gives a number of parameters associated with a call made by a customer. Those parameters include:

-   a) the area code for the source of the call;
-   b) the country code for the destination of the call;
-   c) the area code for the destination of the call;
-   d) the duration of the call in minutes;
-   e) the rate at which the customer was charged for the call;
-   f) the day of the week that the call was made;
-   g) the time of day that the call was made; and
-   h) a flag indicating whether the call was legitimate or fraudulent.

In step 403, the computer is controlled by the program to add a training status label to each of the call entries in the pre-February 1999 training data file 402 to indicate that the call entry is ‘extant’.

Thereafter, the computer prompts the user to load a new training data file 404 (this might for example contain details of calls made during, say, February 1999) from a third compact disc CD3 into the RAM 12. This compact disc contains a plurality of call entries of a similar format to the call entries in the pre-February 1999 training data file 402.

Then in step 406, the computer is controlled by the program to add a training status label to each of the call entries in the February 1999 training data file 404 indicating that the call entry is ‘new’.

The computer then merges (step 408) the pre-February 1999 and February 1999 training data files together to form a labelled pre-March 1999 training data file 412. Some examples of call entries in the labelled file are shown in FIG. 5. It will be seen that, in addition to each of the call parameters (a) to (h) mentioned above, each call entry includes a ‘Training status’ label (i).
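The labelling and merging of steps 403, 406 and 408 can be sketched as follows. This is an illustration only: the call entries are held as Python dictionaries, and the function name `label_entries` and the field names are assumptions made for the example rather than the file format of FIGS. 3 to 5.

```python
def label_entries(entries, status):
    """Return copies of the call entries with a 'training_status' label added."""
    return [dict(entry, training_status=status) for entry in entries]


# Hypothetical call entries; only a few of parameters (a) to (h) are shown.
extant_file = [{"dest_country": 55, "duration_min": 45, "fraudulent": True}]
new_file = [{"dest_country": 72, "duration_min": 50, "fraudulent": True}]

# Steps 403 and 406 label each file; step 408 merges the two labelled files.
labelled = label_entries(extant_file, "extant") + label_entries(new_file, "new")
```

The original entries are copied rather than modified, so the unlabelled files remain available if the labels later need to be reassigned.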

Then, the computer generates, based on the labelled pre-March 1999 training data file 412, a set of rules contained in a rules file 416. As is known to those skilled in the art, each of these rules comprises a set of criteria and a likely conclusion if those criteria are satisfied. In the present example, the rules give a conclusion as to whether a call is suspicious or unsuspicious based on one or more criteria which are dependent upon respective call parameters a) to h).

Those skilled in the art will be able to write and run a suitable rule-generation program, but could alternatively use a commercially available program. For example, the user might purchase the See5 data analysis program from RuleQuest Research Pty Ltd, 30 Athena Avenue, St Ives NSW 2075, Australia. Alternatively, the program can be downloaded in a known manner by running a browser program on the computer, browsing the file found on the www at rulequest dot com, and then following the hyperlinks as instructed.

Thus, the computer generates a derived rules file 416 from the labelled pre-March 1999 training data file.

For example, given the labelled pre-March 1999 training data file 412, the computer running under the control of the program might output the following set of rules.

-   Rule 1: IF dest'n country code is (44) THEN call is UNSUSPICIOUS;
-   Rule 2: IF dest'n country code is NOT (44) AND duration is (<30 min) THEN call is UNSUSPICIOUS;
-   Rule 3: IF dest'n country code is NOT (44) AND call duration is NOT (<30 min) AND training status is (new) AND dest'n country code is (72) THEN call is SUSPICIOUS;
-   Rule 4: IF dest'n country code is NOT (44) AND call duration is NOT (<30 min) AND training status is (new) AND dest'n country code is NOT (72) THEN call is UNSUSPICIOUS;
-   Rule 5: IF dest'n country code is NOT (44) AND call duration is NOT (<30 min) AND training status is (extant) AND dest'n country code is (55) THEN call is SUSPICIOUS;
-   Rule 6: IF dest'n country code is NOT (44) AND call duration is NOT (<30 min) AND training status is (extant) AND dest'n country code is NOT (55) THEN call is UNSUSPICIOUS.
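Rules 1 to 6 can be held as plain data, which makes them easy to filter and edit in the later steps. The sketch below assumes one possible encoding, in which each criterion is a `(parameter, operator, value)` tuple; the encoding, the operator names and the function names are illustrative choices and do not reflect the output format of See5 or any other program.

```python
def criterion_met(entry, criterion):
    """Test one (parameter, operator, value) criterion against a call entry."""
    parameter, op, value = criterion
    actual = entry[parameter]
    if op == "is":
        return actual == value
    if op == "is_not":
        return actual != value
    if op == "lt":
        return actual < value
    if op == "not_lt":
        return actual >= value
    raise ValueError(f"unknown operator: {op}")


# Criteria shared by Rules 3 to 6: dest'n is NOT (44) AND duration is NOT (<30 min).
FAR_LONG = [("dest", "is_not", 44), ("duration", "not_lt", 30)]

RULES = [
    ([("dest", "is", 44)], "UNSUSPICIOUS"),                                          # Rule 1
    ([("dest", "is_not", 44), ("duration", "lt", 30)], "UNSUSPICIOUS"),              # Rule 2
    (FAR_LONG + [("status", "is", "new"), ("dest", "is", 72)], "SUSPICIOUS"),        # Rule 3
    (FAR_LONG + [("status", "is", "new"), ("dest", "is_not", 72)], "UNSUSPICIOUS"),  # Rule 4
    (FAR_LONG + [("status", "is", "extant"), ("dest", "is", 55)], "SUSPICIOUS"),     # Rule 5
    (FAR_LONG + [("status", "is", "extant"), ("dest", "is_not", 55)], "UNSUSPICIOUS"),  # Rule 6
]


def classify(entry, rules):
    """Return the conclusion of the first rule whose criteria all hold, else None."""
    for criteria, conclusion in rules:
        if all(criterion_met(entry, c) for c in criteria):
            return conclusion
    return None


verdict = classify({"dest": 72, "duration": 45, "status": "new"}, RULES)
```

A long call to country code 72 is suspicious only when labelled new, whereas a long call to country code 55 is suspicious only when labelled extant, which is the distinction the purging step exploits.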

In step 418, the program controls the computer to select the rules which include the criterion ‘training status is (extant)’. In the above example, rules 5 and 6 are therefore selected. Each of the selected rules then has the criterion ‘training status is (extant)’ removed to provide an outdated rule. The outdated rules are then stored in an outdated rules file 420. In the present case, the outdated rules file 420 would contain the following outdated rules:

-   Outdated Rule 1: IF dest'n country code is NOT (44) AND call duration is NOT (<30 min) AND dest'n country code is (55) THEN call is SUSPICIOUS;
-   Outdated Rule 2: IF dest'n country code is NOT (44) AND call duration is NOT (<30 min) AND dest'n country code is NOT (55) THEN call is UNSUSPICIOUS.
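Step 418 amounts to a filter followed by the removal of one criterion. A minimal sketch, assuming the rules are encoded as lists of `(parameter, operator, value)` tuples paired with a conclusion (a hypothetical encoding chosen for illustration):

```python
FAR_LONG = [("dest", "is_not", 44), ("duration", "not_lt", 30)]
EXTANT = ("status", "is", "extant")

RULES = [
    ([("dest", "is", 44)], "UNSUSPICIOUS"),
    ([("dest", "is_not", 44), ("duration", "lt", 30)], "UNSUSPICIOUS"),
    (FAR_LONG + [("status", "is", "new"), ("dest", "is", 72)], "SUSPICIOUS"),
    (FAR_LONG + [("status", "is", "new"), ("dest", "is_not", 72)], "UNSUSPICIOUS"),
    (FAR_LONG + [EXTANT, ("dest", "is", 55)], "SUSPICIOUS"),
    (FAR_LONG + [EXTANT, ("dest", "is_not", 55)], "UNSUSPICIOUS"),
]


def outdated_rules(rules):
    """Select the rules containing the 'extant' criterion and strip that criterion."""
    return [
        ([c for c in criteria if c != EXTANT], conclusion)
        for criteria, conclusion in rules
        if EXTANT in criteria
    ]


outdated = outdated_rules(RULES)
```

Applied to the six rules above, this yields the two outdated rules listed in the text.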

In step 422 the pre-February 1999 training data file 402 is processed to remove call entries which meet all the criteria (i.e. all the conditions before the word ‘THEN’) of one or more of the outdated rules. In more detail, a flow chart illustrating the steps which the computer undertakes in order to purge the pre-February 1999 training data file 402 in accordance with a first embodiment of the present invention is shown in FIG. 6.

Firstly, in step 602, the computer is controlled to check whether there are any outdated rules in the outdated rules file 420. If there are no outdated rules, then the purging step 422 ends at step 603.

If there are one or more outdated rules then, at step 604, an outer loop counter (m) is initialised to one. An outer group of operations (606–654) is then carried out.

The outer group of instructions begins with the setting of an inner loop counter (n) to one (step 606).

Thereafter, an inner group of instructions (608, 610, 650) is carried out. The inner group of instructions begins with a test (step 608) to establish whether every one of the one or more criteria of the mth rule in the outdated rules file 420 is true for the nth call entry in the extant training data file 402. If all those criteria are true for the nth call entry, then that call entry is deleted in step 610 (without altering the number n associated with the call entries that follow it). If the mth outdated rule does not provide a conclusion for the call entry then that call entry is maintained in the pre-February 1999 training data file 402. Following the deletion or maintenance of the nth call entry, an inner loop termination test (step 650) is carried out to see whether the nth call entry in the pre-February 1999 training data file is the last call entry. If it is not then n is incremented by one in step 652 and the inner group of instructions (608, 610, 650) is repeated.

When the inner loop termination test (step 650) finds that the last call entry has been reached, an outer loop termination test (step 654) is then carried out.

The outer loop termination test (step 654) finds whether the outer loop counter is equal to the number of rules (M) contained in the outdated rules file 420. If the loop counter is not yet equal to the number of rules (M) contained in the outdated rules file 420 then the outer loop counter m is incremented by one (step 656) and the outer group of instructions (606–654) is repeated for the following outdated rule.

When the loop counter does reach the number of rules (M) contained in the outdated rules file 420 then the purging process 422 ends (step 658).
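The nested loops of FIG. 6 can be sketched compactly as follows. The sketch assumes the same hypothetical tuple encoding of criteria used for illustration elsewhere, and it builds a new list of surviving entries rather than deleting entries in place; the function names and the sample call entries are likewise assumptions.

```python
def criterion_met(entry, criterion):
    """Test one (parameter, operator, value) criterion against a call entry."""
    parameter, op, value = criterion
    actual = entry[parameter]
    if op == "is":
        return actual == value
    if op == "is_not":
        return actual != value
    if op == "not_lt":
        return actual >= value
    raise ValueError(f"unknown operator: {op}")


FAR_LONG = [("dest", "is_not", 44), ("duration", "not_lt", 30)]
OUTDATED = [
    (FAR_LONG + [("dest", "is", 55)], "SUSPICIOUS"),        # Outdated Rule 1
    (FAR_LONG + [("dest", "is_not", 55)], "UNSUSPICIOUS"),  # Outdated Rule 2
]


def purge(extant_entries, outdated):
    """Remove every extant entry that meets all the criteria of any outdated rule."""
    remaining = list(extant_entries)
    for criteria, _conclusion in outdated:   # outer loop over rules (counter m)
        remaining = [                        # inner loop over entries (counter n)
            e for e in remaining
            if not all(criterion_met(e, c) for c in criteria)
        ]
    return remaining


entries = [
    {"dest": 44, "duration": 60},
    {"dest": 55, "duration": 45},
    {"dest": 33, "duration": 10},
    {"dest": 72, "duration": 40},
]
remaining = purge(entries, OUTDATED)
```

With an empty list of outdated rules the loop body never runs and every entry survives, which corresponds to the early exit at step 603.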

After the purging process 422, the remaining call entries are processed to remove the training status (i) and thereby to form an updated pre-March 1999 training data file 424.

The user can then execute any known classifier generation program on the computer to provide a classifier 428 based on the updated pre-March 1999 training data file 424. By way of example, the classifier generation program might be based on decision tree algorithms (e.g. the See5 program mentioned above), or neural net algorithms. Since the updated pre-March 1999 training data file does not contain training status data, if the classifier generation step 426 produces a set of rules, then those rules will not contain any criteria relating to the training status of the examples.

Having generated the classifier, the user can then load incomplete February 1999 call data 430 from CD4. The incomplete February 1999 call data contains call entries for which it is not known whether the entry relates to a legitimate call or not. The user then runs the classifier (step 432) to create a file of suspicious call entries (434) containing identifiers of recent calls that should be regarded as suspicious and hence subjected to further investigation.

It will be seen that the amount of the training data that must be stored in order to enable the computer to generate a classifier in step 426 is less than the total contents of the extant training data file 402 and the new training data file 404. Furthermore, despite the reduction in training data, it will be realised that the accuracy of the classifier generated in step 426 is not significantly reduced in the absence of a change in the fraudulent use of the network in February 1999. However, if such a change were to occur then the above embodiment avoids the significant degradation in performance that would result from using all the pre-February 1999 and February 1999 data.

In another embodiment, the program controls the computer to classify each of the examples in the incomplete February 1999 call data 430 using a modified set of the valid rules produced in the rule derivation step 414. The rules are modified by removing any criteria which relate to the training status parameter. Thus, in the above example, the modified set of valid rules would be:

-   Valid Rule 1: IF dest'n country code is (44) THEN call is UNSUSPICIOUS;
-   Valid Rule 2: IF dest'n country code is NOT (44) AND duration is (<30 min) THEN call is UNSUSPICIOUS;
-   Valid Rule 3 (modified): IF dest'n country code is NOT (44) AND call duration is NOT (<30 min) AND dest'n country code is (72) THEN call is SUSPICIOUS;
-   Valid Rule 4 (modified): IF dest'n country code is NOT (44) AND call duration is NOT (<30 min) AND dest'n country code is NOT (72) THEN call is UNSUSPICIOUS.

This has the further advantage that the processing necessary to generate the classifier in step 426 is substantially reduced.

In yet another embodiment of the present invention, a similarly modified set of valid rules is produced in step 418, the purging step 422 then being carried out in the manner illustrated in FIG. 7.

Firstly, in step 702, the computer is controlled to check whether there are any outdated rules in the outdated rules file 420. If there are no outdated rules, then the purging step 422 ends at step 703.

If there are one or more outdated rules then, at step 704, an outermost loop counter (m) is initialised to one.

An outermost group of operations (706–756) is then carried out.

The outermost group of instructions begins with the setting of an intermediate loop counter (n) to one (step 706).

This is followed by the carrying out of an intermediate group of instructions (708–752) which begins with a test (step 708) to establish whether every one of the one or more criteria of the mth rule in the outdated rules file 420 is true for the nth call entry. If the criteria of the mth outdated rule do apply to the nth call entry then a call entry handling routine (steps 712 to 724) is carried out.

Otherwise, an intermediate loop termination test (step 750) is carried out to find whether the intermediate loop counter equals the number of call entries in the pre-February 1999 data file. If it does not, then the intermediate loop counter (n) is increased by one in step 752 and the intermediate group of instructions is repeated.

On the other hand, if the intermediate loop termination test (step 750) finds that the intermediate loop counter equals the number of call entries in the pre-February 1999 data file, a further test is carried out. The further test involves testing whether the outermost group of instructions (706–752) has been carried out for each of the M outdated rules in the outdated rules file 420. If the outermost group of instructions has not been carried out for each outdated rule, then the outermost loop counter (m) is increased by one in step 756 and the outermost group of instructions (706–752) is repeated. If the outermost group of instructions has been carried out for each outdated rule, then the purging process ends (step 758).

The call entry handling routine (steps 712–724) mentioned above begins by storing the conclusion obtained by applying the mth outdated rule to the nth call entry (step 712).

An appropriate-valid-rule-identification routine (steps 716, 718) then begins with the initialisation of an innermost loop counter (p) to one (step 714). This is followed by a test (step 716) to find whether every one of the criteria in the pth rule in the modified valid rules file is true for the nth call entry in the pre-February 1999 training data file 402. If so, the appropriate-valid-rule-identification routine (steps 716, 718) ends. Otherwise, the innermost loop counter p is increased by one in step 718 and the appropriate-valid-rule-identification routine (steps 716, 718) is repeated.

The routine thus identifies the rule in the modified valid rules file all of whose criteria are true for the nth call entry in the pre-February 1999 training data 402. Thereafter, the conclusion of that rule is found and stored (step 720).

Once the new conclusion has been stored in step 720, the old conclusion (obtained in step 712) is compared to the new conclusion. If the two are the same, then the call entry is maintained in the labelled training data file 412. If the two are different then the call entry is deleted (step 724). In either case, the call entry handling routine then ends.
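The selective purge of FIG. 7 differs from that of FIG. 6 only in that a matching call entry is deleted when the outdated rule and the modified valid rules disagree about it. A minimal sketch, again assuming a hypothetical tuple encoding of criteria and illustrative function names and call entries:

```python
def criterion_met(entry, criterion):
    """Test one (parameter, operator, value) criterion against a call entry."""
    parameter, op, value = criterion
    actual = entry[parameter]
    if op == "is":
        return actual == value
    if op == "is_not":
        return actual != value
    if op == "lt":
        return actual < value
    if op == "not_lt":
        return actual >= value
    raise ValueError(f"unknown operator: {op}")


def first_conclusion(entry, rules):
    """Conclusion of the first rule whose criteria all hold for the entry, else None."""
    for criteria, conclusion in rules:
        if all(criterion_met(entry, c) for c in criteria):
            return conclusion
    return None


FAR_LONG = [("dest", "is_not", 44), ("duration", "not_lt", 30)]
OUTDATED = [
    (FAR_LONG + [("dest", "is", 55)], "SUSPICIOUS"),
    (FAR_LONG + [("dest", "is_not", 55)], "UNSUSPICIOUS"),
]
VALID = [
    ([("dest", "is", 44)], "UNSUSPICIOUS"),
    ([("dest", "is_not", 44), ("duration", "lt", 30)], "UNSUSPICIOUS"),
    (FAR_LONG + [("dest", "is", 72)], "SUSPICIOUS"),
    (FAR_LONG + [("dest", "is_not", 72)], "UNSUSPICIOUS"),
]


def selective_purge(extant_entries, outdated, valid):
    """Keep an entry unless an outdated rule applies and its conclusion
    differs from the conclusion of the modified valid rules."""
    survivors = []
    for entry in extant_entries:
        old = first_conclusion(entry, outdated)   # cf. step 712
        new = first_conclusion(entry, valid)      # cf. steps 714 to 720
        if old is None or old == new:
            survivors.append(entry)
    return survivors


entries = [
    {"dest": 55, "duration": 45},  # old SUSPICIOUS, new UNSUSPICIOUS
    {"dest": 72, "duration": 45},  # old UNSUSPICIOUS, new SUSPICIOUS
    {"dest": 33, "duration": 45},  # both UNSUSPICIOUS
    {"dest": 44, "duration": 60},  # no outdated rule applies
]
survivors = selective_purge(entries, OUTDATED, VALID)
```

The long call to country code 33 matches an outdated rule yet survives, because the up-to-date rules reach the same conclusion; the purge of FIG. 6 would have deleted it.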

Once the process of FIG. 7 has been carried out, the training status labels are removed from the examples remaining in the labelled training data file 412 to provide an up-to-date training data file 424.

The process then continues as described in relation to the first embodiment.

The third embodiment is preferred to the first because it is found to produce training data which models a time-variant behaviour more accurately than does the training data produced by the first embodiment.

In a fourth embodiment of the present invention, labelled training data 412 is produced as it is in the above embodiments. Then, step 414 of the above method is replaced by the generation of a neural net which has the training status as one of its inputs and an output which indicates whether the call entry is to be regarded as suspicious or unsuspicious. The neural net is trained in a conventional manner using the labelled training data 412.

The labelled training data is then purged as follows. First, the classification predicted by the neural net for each labelled extant call entry is stored. Then the extant call entries are relabelled as ‘new’. Again, the classification predicted by the neural net is found and stored. Only those extant call entries that are classified similarly in the two cases are recorded in the up-to-date training data 424.
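This relabelling test works with any classifier that takes the training status as an input. The sketch below illustrates it with a hand-written function, `toy_predict`, standing in for the trained neural net; that function, the name `purge_by_relabelling`, the field names and the call entries are all assumptions made for the example, and no neural net library is used.

```python
def purge_by_relabelling(extant_entries, predict):
    """Keep the extant entries whose predicted class is the same whether
    they are labelled 'extant' or relabelled as 'new'."""
    survivors = []
    for entry in extant_entries:
        as_extant = predict(dict(entry, status="extant"))
        as_new = predict(dict(entry, status="new"))
        if as_extant == as_new:
            survivors.append(entry)
    return survivors


def toy_predict(entry):
    """Stand-in for a trained classifier whose output depends on the status input."""
    if entry["dest"] == 44 or entry["duration"] < 30:
        return "UNSUSPICIOUS"
    suspicious_dest = 72 if entry["status"] == "new" else 55
    return "SUSPICIOUS" if entry["dest"] == suspicious_dest else "UNSUSPICIOUS"


entries = [
    {"dest": 55, "duration": 45},
    {"dest": 72, "duration": 45},
    {"dest": 33, "duration": 45},
    {"dest": 44, "duration": 60},
]
survivors = purge_by_relabelling(entries, toy_predict)
```

Passing the classifier in as a callable keeps the purge independent of how the classifier was trained.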

The above embodiments will be of use in many applications. Nevertheless, they are especially useful in relation to some applications in which the detection of an event of a certain type will itself result in a change in the time-variant behaviour being investigated. For example, when used in the detection of fraudulent use of, or faults in, a communications network, the improved detection will result in a change in the time-variant behaviour being modelled (assuming the fraudulent use or fault will cease once detected).

1. A computer-implemented method for operating data processing apparatus to rationalize stored training data, said training data being suitable for generating a model of a time-variant behavior wherein said training data includes an extant set of training examples, each member of said extant set providing values for a newness parameter and one or more other parameters, the value of said newness parameter indicating that the event to which that training example relates occurred before a predetermined time; said method comprising: (a) gathering an update set of training examples, each member of said update set providing values for said newness parameter and said one or more other parameters, the value of said newness parameter indicating that the event to which that training example relates occurred after said predetermined time; (b) analyzing said extant set and said update set to generate a classifier which classifies training examples from both said extant set and said update set, the classification of the training examples being a function of a plurality of parameters which include said newness parameter; and (c) forming a surviving set of training examples by selecting, on the basis of the generated classifier, a surviving set of training examples from said extant set.
 2. A method as in claim 1 further comprising:generating said extant set by gathering training examples by adding anewness parameter value to each of a first set of incomplete extantexamples, each of which provides values for said one or more otherparameters, said newness parameter value indicating that said examplesrelate to events that occurred before said predetermined time; andgenerating said update set by adding a newness parameter value to eachof a second set of incomplete update examples, each of which providesvalues for said one or more other parameters, said newness parametervalue indicating that said examples relate to events that occurred aftersaid first predetermined time.
3. A method as in claim 1 further comprising combining said surviving set and said update set to provide a purged set of one or more training examples.
4. A method as in claim 3 further comprising removing the newness parameter value from each member of said purged set to generate an incomplete purged set of one or more incomplete training examples.
5. A computer-implemented method of classifying an unclassified example of time-variant behavior, said method comprising: generating a purged set of training examples in accordance with claim 3; generating a classifier based on said purged set of training examples; and classifying said unclassified example using said classifier.
6. A method as in claim 4 further comprising: adding a newness parameter value to each of said incomplete training examples of said purged set, which newness parameter value indicates that said examples relate to events that occurred before a second predetermined time, thereby forming a new set of training examples; and repeating steps (a), (b) and (c) treating said new set of training examples as said extant set, and said second predetermined time as said predetermined time.
7. A method as in claim 1 wherein said forming step (c) comprises forming a surviving set of training examples which the generated classifier would classify in the same way were the value of said newness parameter to be changed to indicate that the event to which the example relates occurred after said predetermined time.
8. A method as in claim 1 wherein: step (b) comprises analyzing said training data to generate representations of logical rules, each of which rules comprises one or more criteria relating to respective ones of said one or more other parameters and a corresponding conclusion, one or more of said rules including a newness criterion which is met for those examples in which the value of said newness parameter indicates that said event occurred after said first predetermined time; and step (c) comprises identifying a subset of said logical rules that include a requirement that said newness criterion is not met as outdated rules; and removing at least some of those training examples which meet all the criteria of one or more of said outdated rules.
9. A method as in claim 8 wherein said removing sub-set of step (c) comprises removing all of those training examples which meet all the criteria of one or more of said outdated rules.
10. A method as in claim 8 wherein step (c) further comprises: identifying rules having a requirement that said newness requirement is met as new rules; generating rationalized new rules by removing said newness criterion; identifying rules having no newness criterion as surviving rules; forming an up-to-date set of rules by combining said surviving rules and said new rules; classifying said training example into a first class on the basis of said outdated rules; classifying said training example into a second class on the basis of said up-to-date rules; and removing said training example from said extant set if said first and second classes differ.
11. A method of classifying an unclassified training example, said method comprising: generating a purged set of training examples in accordance with claim 10 and classifying said unclassified example using said up-to-date rules.
12. A method as in claim 8 wherein said logical rules are arranged as a decision-tree.
13. Apparatus comprising an input device, memory, an output device, and a data processing unit, each of said devices being connected in operation to said data processing unit, wherein said memory stores: a) extant training data input code executable by said data processing unit to cause extant training data representations to be also represented at predetermined locations in said memory, said extant training data comprising a plurality of extant training examples, each extant training example providing values for a newness parameter and a set of one or more other parameters, the value of said newness parameter indicating that said event occurred before a predetermined time; b) update training data input code executable by said data processing unit to cause update training data representations presented to said input device to be also represented at further predetermined locations in said memory, said update training data comprising a plurality of update training examples, each update training example providing values for a newness parameter and a set of one or more other parameters, the value of said newness parameter indicating that said event occurred after a predetermined time; c) a classifier generation code executable by said data processing unit to generate a classifier which is able to classify training examples from both said extant set and said update set, the classification of the training examples being dependent on the value of one or more of said one or more other parameters and said newness parameter; and d) training data purging code executable by said data processing unit to remove from said extant training data at least some of those examples which the generated classifier would classify differently were the value of said newness parameter to be changed to indicate that the event to which the example relates occurred after said predetermined time.
14. A computer-implemented method of operating a data processing apparatus to process classifier training data to provide purged classifier training data, said data processing apparatus comprising an input device, an output device, memory and a data processing unit, said method comprising: storing training data in said memory, said training data comprising: (a) an extant set of training examples, each member of said extant set providing values for a newness parameter and one or more other parameters, the value of said newness parameter indicating that the event to which that training example relates occurred before a predetermined time; and (b) an update set of training examples, each member of said update set providing values for said newness parameter and said one or more other parameters, the value of said newness parameter indicating that the event to which that training example relates occurred after said predetermined time; operating said data processing unit to process said training data to generate a classifier which is able to classify training examples from both said extant set and said update set, the classification of the training examples being dependent on the value of one or more of said one or more other parameters and said newness parameter; and operating said data processing unit to remove from said extant set those training examples which the generated classifier would classify differently were the value of said newness parameter to be changed to indicate that the event to which the example relates occurred after said predetermined time.
15. A computer program product comprising a machine readable medium storing a computer program of instructions executable to perform method steps according to claim 1.